Disentangled Feature Learning for Real-Time Neural Speech Coding
Recently, end-to-end neural audio/speech coding has shown great potential
to outperform traditional signal-analysis-based audio codecs. This is mostly
achieved by following the VQ-VAE paradigm where blind features are learned,
vector-quantized and coded. In this paper, instead of blind end-to-end
learning, we propose to learn disentangled features for real-time neural speech
coding. Specifically, more global-like speaker identity and local content
features are learned with disentanglement to represent speech. Such a compact
feature decomposition not only achieves better coding efficiency by exploiting
bit allocation among different features but also provides the flexibility to do
audio editing in embedding space, such as voice conversion in real-time
communications. Both subjective and objective results demonstrate its coding
efficiency and we find that the learned disentangled features show comparable
performance on any-to-any voice conversion with modern self-supervised speech
representation learning models with far fewer parameters and low latency,
showing the potential of our neural coding framework.
Comment: Submitted to ICASSP202
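The vector-quantization step that the abstract above attributes to the VQ-VAE paradigm can be illustrated with a minimal sketch. This is not the authors' codec: the codebook values, the feature vectors, and the helper names `quantize` and `bits_per_index` are hypothetical, and fixed-length index coding is assumed for the bit count.

```python
import math

def quantize(vector, codebook):
    """Return the index of the codebook entry nearest to `vector`
    under squared Euclidean distance."""
    best_index, best_dist = 0, float("inf")
    for i, code in enumerate(codebook):
        dist = sum((v - c) ** 2 for v, c in zip(vector, code))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index

def bits_per_index(codebook):
    """Bits needed to send one index with fixed-length coding (assumption)."""
    return math.ceil(math.log2(len(codebook)))

# Hypothetical 4-entry codebook and three feature vectors.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.1, -0.2], [0.9, 0.8], [0.2, 1.1]]

indices = [quantize(f, codebook) for f in features]   # what the encoder transmits
decoded = [codebook[i] for i in indices]              # what the decoder reconstructs
```

Under these assumptions each feature vector costs two bits. Giving separate features (for example speaker and content) codebooks of different sizes is one simple way to picture the bit allocation the abstract refers to.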
Interactive Speech and Noise Modeling for Speech Enhancement
Speech enhancement is challenging because of the diversity of background
noise types. Most existing methods focus on modelling the speech
rather than the noise. In this paper, we propose a novel idea to model speech
and noise simultaneously in a two-branch convolutional neural network, namely
SN-Net. In SN-Net, the two branches predict speech and noise, respectively.
Instead of information fusion only at the final output layer, interaction
modules are introduced at several intermediate feature domains between the two
branches to benefit each other. Such an interaction can leverage features
learned from one branch to counteract the undesired part and restore the
missing component of the other and thus enhance their discrimination
capabilities. We also design a feature extraction module, namely
residual-convolution-and-attention (RA), to capture the correlations along
temporal and frequency dimensions for both the speech and the noise.
Evaluations on public datasets show that the interaction module plays a key
role in simultaneous modeling and the SN-Net outperforms the state-of-the-art
by a large margin on various evaluation metrics. The proposed SN-Net also shows
superior performance for speaker separation.
Comment: AAAI 2021 (Accepted)
DasFormer: Deep Alternating Spectrogram Transformer for Multi/Single-Channel Speech Separation
For the task of speech separation, previous studies usually treat
multi-channel and single-channel scenarios as two research tracks with
specialized solutions developed for each. Instead, we propose a simple and
unified architecture - DasFormer (Deep alternating spectrogram transFormer) to
handle both of them in challenging reverberant environments. Unlike
frame-wise sequence modeling, each TF-bin in the spectrogram is assigned
an embedding encoding spectral and spatial information. With such input,
DasFormer is then formed by multiple repetitions of a simple block, each of which
integrates 1) two multi-head self-attention (MHSA) modules that alternately
process within each frequency bin and each temporal frame of the spectrogram, and 2)
an MBConv before each MHSA for modeling local features on the spectrogram.
Experiments show that DasFormer has a powerful ability to model the
time-frequency representation: its performance far exceeds the current SOTA
models in multi-channel speech separation, and it also achieves single-channel
SOTA in the more challenging yet realistic reverberation scenario.
Comment: 5 pages, accepted by ICASSP202
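The alternating attention pattern described in the abstract above can be sketched as follows. This is a heavily simplified illustration and not the DasFormer implementation: it uses single-head dot-product attention with no learned projections, and omits the MBConv layers, residual connections and normalization. The names `self_attention` and `alternating_block` and the toy input are hypothetical.

```python
import math

def self_attention(seq):
    """Single-head dot-product self-attention over a list of vectors.
    Assumption: queries, keys and values are the inputs themselves."""
    dim = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in seq]
        top = max(scores)                      # subtract max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[d] for w, v in zip(weights, seq)) / total
                    for d in range(dim)])
    return out

def alternating_block(spec):
    """spec[f][t] is the embedding of one TF bin.
    Step 1 attends along time within each frequency bin;
    step 2 attends along frequency within each temporal frame."""
    n_freq, n_time = len(spec), len(spec[0])
    along_time = [self_attention(row) for row in spec]
    out = [[None] * n_time for _ in range(n_freq)]
    for t in range(n_time):
        column = [along_time[f][t] for f in range(n_freq)]
        attended = self_attention(column)
        for f in range(n_freq):
            out[f][t] = attended[f]
    return out

# Toy spectrogram: 2 frequency bins, 3 frames, 2-dimensional embeddings.
spec = [[[float(f + t), float(f - t)] for t in range(3)] for f in range(2)]
result = alternating_block(spec)
```

The output keeps the same frequency-by-time-by-embedding layout as the input, which is what allows such a block to be stacked repeatedly.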
Notch1 is required for hypoxia-induced proliferation, invasion and chemoresistance of T-cell acute lymphoblastic leukemia cells
Background
Notch1 is a potent regulator known to play an oncogenic role in many malignancies including T-cell acute lymphoblastic leukemia (T-ALL). Tumor hypoxia and increased hypoxia-inducible factor-1α (HIF-1α) activity can act as major stimuli for tumor aggressiveness and progression. Although hypoxia-mediated activation of the Notch1 pathway plays an important role in tumor cell survival and invasiveness, the interaction between HIF-1α and Notch1 has not yet been identified in T-ALL. This study was designed to investigate whether hypoxia activates Notch1 signalling through HIF-1α stabilization and to determine the contribution of hypoxia and HIF-1α to proliferation, invasion and chemoresistance in T-ALL.
Methods
T-ALL cell lines (Jurkat, Sup-T1) transfected with HIF-1α or Notch1 small interfering RNA (siRNA) were incubated in normoxic or hypoxic conditions. Their potential for proliferation and invasion was measured by WST-8 and transwell assays. Flow cytometry was used to detect apoptosis and assess cell cycle regulation. Expression and regulation of components of the HIF-1α and Notch1 pathways and of genes related to proliferation, invasion and apoptosis were assessed by quantitative real-time PCR or Western blot.
Results
Hypoxia potentiated Notch1 signalling via stabilization and activation of the transcription factor HIF-1α. Hypoxia/HIF-1α-activated Notch1 signalling altered expression of cell cycle regulatory proteins and accelerated cell proliferation. Hypoxia-induced Notch1 activation increased the expression of matrix metalloproteinase-2 (MMP2) and MMP9, which increased invasiveness. Of greater clinical significance, knockdown of Notch1 prevented the protective effect of hypoxia/HIF-1α against dexamethasone-induced apoptosis. This sensitization correlated with losing the effect of hypoxia/HIF-1α on Bcl-2 and Bcl-xL expression.
Conclusions
Notch1 signalling is required for hypoxia/HIF-1α-induced proliferation, invasion and chemoresistance in T-ALL. Pharmacological inhibitors of HIF-1α or Notch1 signalling may be attractive interventions for T-ALL treatment.